EDGEWOOD  CHEMICAL  BIOLOGICAL  CENTER 

U.S.  ARMY  RESEARCH,  DEVELOPMENT  AND  ENGINEERING  COMMAND 
Aberdeen  Proving  Ground,  MD  21010-5424 

ECBC-TN-068 


MODIFIED  MAXIMUM  LIKELIHOOD  ESTIMATION  METHOD 
FOR  COMPLETELY  SEPARATED  AND  QUASI-COMPLETELY 
SEPARATED  DATA  FOR  A  DOSE-RESPONSE  MODEL 


Kyong  H.  Park 
Steven  J.  Lagan 

RESEARCH  AND  TECHNOLOGY  DIRECTORATE 


August  2015 


Approved  for  public  release;  distribution  is  unlimited. 


Disclaimer 


The  findings  in  this  report  are  not  to  be  construed  as  an  official  Department  of  the  Army 
position  unless  so  designated  by  other  authorizing  documents. 


REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB  No.  0704-0188 


Public  reporting  burden  for  this  collection  of  information  is  estimated  to  average  1  h  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and 
maintaining  the  data  needed,  and  completing  and  reviewing  this  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information, 
including  suggestions  for  reducing  this  burden  to  Department  of  Defense,  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports  (0704-0188),  1215  Jefferson 
Davis  Highway,  Suite  1204,  Arlington,  VA  22202-4302.  Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  any  penalty  for  failing  to 
comply  with  a  collection  of  information  if  it  does  not  display  a  currently  valid  OMB  control  number.  PLEASE  DO  NOT  RETURN  YOUR  FORM  TO  THE  ABOVE  ADDRESS. 


1 .  REPORT  DATE  (DD-MM-YYYY)  2.  REPORT  TYPE  3.  DATES  COVERED  (From  -  To) 

XX-08-2015  Final  Jun  2014  -  Aug  2014 


4.  TITLE  AND  SUBTITLE 

Modified  Maximum  Likelihood  Estimation  Method  for  Completely 
Separated  and  Quasi-Completely  Separated  Data  for  a  Dose-Response 
Model 


5a.  CONTRACT  NUMBER 


5b.  GRANT  NUMBER 


5c.  PROGRAM  ELEMENT  NUMBER 


6.  AUTHOR(S) 

Park,  Kyong  H.;  and  Lagan,  Steven  J. 


5d.  PROJECT  NUMBER 

CB10128 


5e.  TASK  NUMBER 


5f.  WORK  UNIT  NUMBER 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Director,  ECBC,  ATTN:  RDCB-DRI-M,  APG,  MD  21010-5424 


9.  SPONSORING  /  MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Defense  Threat  Reduction  Agency,  8725  John  J.  Kingman  Road,  MSC 
6201,  Fort  Belvoir,  VA  22060-620 1 


8.  PERFORMING  ORGANIZATION  REPORT 
NUMBER 

ECBC-TN-068 


10.  SPONSOR/MONITOR’S  ACRONYM(S) 

DTRA 


11.  SPONSOR/MONITOR’S  REPORT  NUMBER(S) 


12.  DISTRIBUTION  /  AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  is  unlimited. 


13.  SUPPLEMENTARY  NOTES 


14.  ABSTRACT: 

When  data  are  completely  or  quasi-completely  separated,  the  traditional  maximum  likelihood  estimation  (MLE)  method 
generates  infinite  estimates.  The  bias-reduction  (BR)  method,  which  is  a  variant  of  the  bias-correction  method,  removes  the 
first-order  bias  term  by  applying  a  modified  score  function,  and  it  always  produces  finite  estimates.  By  comparison,  the 
traditional  MLE  method  is  unreliable  because  it  produces  infinite  estimates.  Unlike  the  Bayesian  method,  which  may  be 
difficult  to  apply  in  some  situations,  the  BR  method  does  not  require  prior  information.  The  purpose  of  this  paper  is  to 
provide  all  of  the  necessary  equations  and  procedures  required  to  carry  out  the  BR  method,  thereby  enabling  use  of  the 
method  on  common  computing  platforms  such  as  Basic  in  Microsoft  Excel  and  Minitab.  The  U.S.  Army  Edgewood 
Chemical  Biological  Center  members  have  been  using  related  probit  slopes  for  separated  data,  but  such  slopes  are  not 
always  known.  The  BR  method  is  a  useful  alternative  for  estimating  parameters  of  separated  data  sets  and  small-sample 
sizes. 


15.  SUBJECT  TERMS 

Modified  score  function  Logit 

Bias-reduction  (BR)  method  Bias-correction  (BC)  method 

Traditional  maximum  likelihood  estimation  (MLE)  method 

Probit 

Bayesian  method 

R  software 

16.  SECURITY  CLASSIFICATION  OF: 

17.  LIMITATION  OF 
ABSTRACT 

18.  NUMBER  OF 
PAGES 

19a.  NAME  OF  RESPONSIBLE  PERSON 

Renu  B.  Rastogi 

a.  REPORT 

b.  ABSTRACT 

c.  THIS  PAGE 

19b.  TELEPHONE  NUMBER  (include  area  code) 

u 

U 

u 

uu 

30 

(410) 436-7545 

Standard  Form  298  (Rev.  8-98) 
Prescribed  by  ANSI  Std.  Z39.18 


Blank 


11 


EXECUTIVE  SUMMARY 


Applying  a  generalized  linear  model  (GLM)  with  a  logit  or  probit  link  is  a  routine 
procedure  for  estimating  the  effective  or  lethal  dose  for  a  dose-response  model.  The  traditional 
maximum  likelihood  estimation  (MLE)  method  used  in  GLMs  generates  infinite  estimates 
whether  data  are  completely  separated  (CS)  or  quasi-completely  separated  ([quasi-CS]  i.e.,  when 
the  range  of  doses  for  one  of  the  responses  [0  or  1]  does  not  overlap  the  range  of  doses  for  the 
other  response,  or  when  it  overlaps  at  only  a  single-dose  level). 

The  bias-reduction  (BR)  method,  described  in  this  paper,  removes  the  first-order 
bias  tenn  by  applying  a  modified  score  function.  This  method  always  generates  finite  estimates, 
which  embody  all  of  the  properties  inherent  in  traditional  MLEs.  Unlike  the  Bayesian  method, 
which  may  not  be  practical  in  many  different  situations,  the  BR  method  does  not  need  prior 
information. 


The  purpose  of  this  paper  is  to  provide  all  the  necessary  formulas  and  procedures 
to  carry  out  the  BR  method  to  generate  finite  estimates.  This  method  could  be  used  in  common 
computing  platforms  (e.g.,  Microsoft  Excel  and  Minitab)  that  are  not  pre-programmed  to  execute 
the  BR  method.  U.S.  Army  Edgewood  Chemical  Biological  Center  members  have  been  using 
related  probit  slopes  when  dealing  with  separated  data,  but  the  BR  method  would  provide  an 
additional  option  in  cases  where  the  probit  slopes  are  not  known.  Additionally,  the  BR  method 
can  be  used  in  small  samples  to  reduce  first-order  asymptotic  bias  of  parameter  estimates.  The 
BR  method  was  written  in  R  software  (listed  in  Appendix  B),  and  parameter  estimates  were 
compared  between  the  BR  and  traditional  MLE  methods  for  three  different  data  sets  (i.e.,  CS, 
quasi-CS,  and  overlapped).  The  BR  method  produced  finite  confidence  intervals  (CIs)  for  all 
three  data  sets,  whereas  the  traditional  MLE  method  generated  infinite  CIs  for  the  two  separated 
data  sets. 
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MODIFIED  MAXIMUM  LIKELIHOOD  ESTIMATION  METHOD 
FOR  COMPLETELY  SEPARATED  AND  QUASI-COMPLETELY  SEPARATED  DATA 

FOR  A  DOSE-RESPONSE  MODEL 


1.  INTRODUCTION 

Maximum  likelihood  estimation  (MLE)  for  a  generalized  linear  model  (GLM)  is 
the  most  widely  used  method  for  estimating  parameters  for  dose-response  (DR)  experiments  with 
binary  random  0  and  1  response  variables.  For  DR  models,  a  log-likelihood  binomial 
distribution,  with  a  logit  or  probit  link,  is  a  default  method  in  statistical  software  for  estimating 
the  50th  percentile  of  the  lethal  dose  (LD50)  or  the  50th  percentile  of  the  effective  dose  (ED50) 
for  risk  estimates. 

The  MLE  method  does  not  apply  when  data  are  completely  separated  (CS)  or 
quasi-completely  separated  (quasi-CS)  between  binary  responses  (i.e.,  when  the  range  of  doses 
for  the  responses  [0  or  1]  does  not  overlap  or  overlaps  at  only  a  single-dose  level).  The 
conditions  of  existence  for  MLE  esthnates  are  discussed  in  several  papers,1,2  and  estimates  from 
the  MLE  method  go  to  infinity  for  the  CS  or  quasi-CS  binary  data.  Infinite  estimates  can  be 
considered  as  inaccurate  with  infinite  confidence  intervals  (CIs).  Unique  and  finite  estimates 
exist  when  data  are  overlapped  in  explanatory  variables  for  binary  responses.  Graphs  with  the 
three  different  data  distributions  are  shown  for  CS  doses  (Figure  1).  quasi-CS  doses  (Figure  2), 
and  overlapped  doses  (Figure  3). 


Figure  1.  CS  data  for  a  DR  model  with  binary  random  0  and  1  response  variables. 
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Figure  2.  Quasi-CS  data  for  a  DR  model  with  binary  random  0  and  1  response  variables 

(two  responses  overlap  at  dose  level  3.18). 
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Figure  3.  Overlapped  data  for  a  DR  model  with  binary  random  0  and  1  response  variables 
(two  responses  overlap  at  several  dose  levels). 

The  MLE  method  is  used  to  estimate  the  parameters  of  a  nonlinear  regression 
from  score  equations  =  0  (Section  2)  where  logZ  is  a  log-likelihood  function. 

d  Pi 

Firth  recommends  a  first-order  bias-reduction  (BR)  method  for  small-sample  sizes  based  on  a 
modified  score  equation.3  Firth’s  modified  MLE  (MMLE)  function  is  related  to  the  penalized 
log-likelihood  (PL)  function,  and  its  formulation  was  suggested  as  a  solution  for  the  CS  and 
quasi-CS  data.  This  equation  will  be  discussed  in  detail  in  Section  2. 

Other  methods  for  overcoming  the  problems  posed  by  the  CS  and  quasi-CS  data 
prevail,  such  as  incorporating  prior  information  to  augment  or  better  estimate  the  current  data.4 
However,  providing  previously  collected  data  is  not  always  practical  or  possible. 

When  calculating  LD50  or  ED50  values  for  chemical  warfare  and  toxic  industrial 
agent  human  risk  estimates  at  U.S.  Army  Edgewood  Chemical  Biological  Center  (ECBC),  probit 
links  (i.e.,  statistical  relationships)  have  traditionally  been  used  to  perform  analyses.  Probit 
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slopes  and  LD50  or  ED50  values  are  necessary  parameters  in  calculating  casualty  estimates  for 
downwind  hazardous  prediction  models  such  as  VLSTrack3  and  HPAC.b  Whenever  possible, 
ECBC  members  have  been  using  previously  known,  related  probit  slopes  for  the  CS  or  quasi-CS 
data  sets. 


Most  of  the  published  papers  that  were  researched  for  this  report  discussed  the 
MMLE  method  for  the  exponential  family  with  a  canonical  link  (EFCL)  such  as  logistic 
binomial  distribution.  Because  not  much  information  is  available  on  the  necessary  formulas  for 
estimating  CS  and  quasi-CS  data-probit-link  parameters,  the  purpose  of  this  paper  is  to  provide 
all  the  necessary  formulas  and  procedures  needed  for  estimation  of  finite  parameters.  These 
formulas  may  widen  the  application  of  the  method  to  the  other  platforms  that  do  not  have  built-in 
modules  to  carry  out  MMLEs. 

Section  2  discusses  MLE  properties  and  bias-correction  (BC)  derivations  of  first- 
order  asymptotic  MLE  bias.  The  BR  method  used  for  applying  modified  score  functions  and 
their  relationships  to  PL  functions  for  the  logistic  binomial  function  are  also  discussed.  The 
differences  between  the  BC  and  BR  methods  are  also  briefly  discussed,  and  general  equations  for 
the  BC  and  BR  methods  are  provided.  For  the  BR  method,  two  different  formulas  are  provided 
for  canonical  and  non-canonical  links.  Section  3  provides  the  estimation  process  for  applying  an 
iterative  reweighted  procedure  and  obtaining  initial  estimates  to  begin  the  procedure.  Section  4 
summarizes  the  methods  and  procedures  for  the  BR  method.  The  R  codes  ([R  Programming 
Language  and  Software]  Ross  Ihaka  and  Robert  Gentleman,  University  of  Auckland,  New 
Zealand),  which  are  used  to  estimate  an  intercept  and  a  slope  using  the  BR  method,  are  listed  in 
Appendix  B. 


2.  BACKGROUND  FOR  DERIVATION  OF  MODIFIED  SCORE  FUNCTION 

2.1  Properties  of  MLE  Values  in  GLM 

Generally,  GLM  is  composed  of  three  components:  (1)  The  random  component  of 
response  values,  y;  (2)  the  systematic  component  producing  a  linear  predictor,  6  —  V  Pu  and 
(3)  a  link  function  between  the  random  and  systematic  components,  6  —  g  (pi)  —  Xfi,  where  g(p) 
is  a  link  function.  The  random  component,  y,  with  the  exponential  family  (such  as  binomial 
distribution)  has  the  following  distributional  form:6 

,  r (yo-bm,  , 

/  =  expl  a(0)  +c(y’0)l  (1) 

with  distribution-specific  functions  a(.),  b(.),  and  c(.).  For  the  binomial  distribution,  the 
dispersion  parameter  is  a(cp)  =  1,  and  it  can  be  expressed  in  the  log  form  of  the  exponential 
formula: 


a  VLSTrack  is  Vapor,  Liquid,  and  Solid  Tracking. 
b  HP  AC  is  Hazard  Prediction  and  Assessment  Capability. 
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log  (/)  =  ylog( 


^  _  nl°g(1  -  ^  1  +  logCy) 


(2) 


The  canonical  link  derived  for  logit  from  eq  2  is  expressed  as 


9  =  log(7 


7T 


0 


(3) 


X1-7T) 

The  logit  model  fits  to  the  0  for  a  linear  model  and  transforms  back  to  n.  For  probit,  the  link  is 

9  =  F~\n )  (4) 

where  F(x^/?)  is  a  standard  cumulative  normal  distribution  expressed  as 


F(x 


rx'jP  1 

=  J-oo 


exp 


y‘ 


dy 


(5) 


The  MLE  method  is  used  to  estimate  regression  parameters  ftj  (j  =  1 , . . .  ,p)  by 
maximizing  the  likelihood  function  used  to  apply  first-order  derivatives  with  regard  to  parameter 
The  log-likelihood  function  for  the  probit  model  linked  with  the  binomial  distribution  is 


lnL=g{y,lni^^)  +  ln(1-F(x; 


(70) 


(6) 


where  yt  is  response  variables  0  and  1 .  The  first-order  derivatives  of  the  above  log-likelihood 
function  are 


din  (L(/?)) 


ft 

-I 


Wj  ^F(xjp)(l-F{xjp)) 


{yi-F{x'jp))xi 


(7) 


where  j  is  the /  '  explanatory  variable,  and  f(x)  denotes  a  standard  normal-density  function 

1  J*l\ 

f{x)  =  —;=&  V  2  / 


^^2n 


(8) 


The  estimates  are  calculated  leading  to  the  above  score  equations  as 


d\nL(p) 


U{Pj)  =  o 


(9) 


The  MLE  method  has  many  desirable  properties  for  large  samples.  For 
example,  MLE  ft  is  a  consistent  estimator  of  true  parameter  fio  with  the  probability  approaching  1 

p 

as  n  oo,  that  is, /?mie  ^ /?0'  The  consistent  solution  is  asymptotically  normal, 
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—  (B  —  Bq)  -*  Z  (i.e.,  B mie  converges  to  distribution  Z,  where  Z  is  a  standard  nonnal 

(7  v  y 

distribution,  and  asymptotic  variances  of  unbiased  estimators  follow  /(/?0)-1>  where  7(/?0)  is  the 
Fisher  infonnation  matrix).  The  information  matrix  for  the  probit  model  is  expressed  as 


ox'jB 

/(/?)  =  E( — — - 


CT/? 


-y 


-E 


(a2  log(xjp)\  =  rn  f2(x'jP) 

\  °V2  )  Ji=tF(xjp)(l-F(xjpy) 


(10) 


where  XX  is  an  explanatory  matrix  including  a  vector  of  1  for  an  intercept.  The  MLE  estimates 
are  asymptotically  unbiased  and  reach  the  Cramer-Rao  bound,7  which  may  be  expressed  as 


var(/?)  > 


1 

KPo) 


(11) 


2.2  BC  Method  Derivation 


For  CS  and  quasi-CS  data,  the  estimates  from  the  above  score  equations  go  to 
infinity  with  large  variances  exceeding  thousands,  which  makes  the  estimates  inaccurate  with 
CIs  reaching  infinity.  To  remedy  the  infinite  estimates,  the  score  function  is  modified  when 
derived  from  the  BC  method. 

The  score  function  computed  from  the  profile  likelihood  is  biased3  because  of  the 
combination  of  the  curvature  and  unbiased  nature  of  the  score  function  (i.e.,  0  expectation  of  the 
score  function  expressed  as  E{U  (/?0) }  -  0  at  the  true  value  of  [>). 

The  expected  estimators  of  MLE  vector  /3s  may  be  expressed  as 
E  (/?)  —  P0  +  ~~~  +  — »  where  [h  is  the  true  unknown  parameter;  /?.  is  the 

bounded  function  of  jh\  and  n  is  usually  the  number  of  observations.  BC  is  generally 
accomplished  in  two  steps: 

(1)  maximum  likelihood  estimates  are  calculated  and  (2)  the  estimates  are  corrected  up  to  the 
first-order  bias  term  by  subtracting  it  from  the  asymptotic  bias  of  the  maximum  likelihood 

estimates  of  / 3  (i.e.,  /?bc=  /?  —  bl ^°'>). 

Many  authors  discussed  the  BC  technique  subtraction  of  the  first-order  bias  term, 
especially  for  distributions  in  the  EFCL,  such  as  the  logistic  binomial  function.8’9  Nelder 

provided  a  formula  for  the  first  bias  tenn,  bl^°\  for  the  EFCL,  which  was  expressed  as 


where 


b-^-  =  {X^Xy^We 


(12) 


(13) 
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and  Qu  and  W  are  diagonal  elements  of  the  matrices 


q  =  x(xTwxy1xT 


(14) 


and 


(15) 


The  k2i  and  ka  are  the  variances  of  response  values  y  and  the  third-order 
cumulant10  for  the  Bernoulli  distribution,  which  are  7t(l  -  n)  and  7t(l  -  7i)(l  -  2n),  respectively. 
The  derivation  of  the  cumulant  generating  function  for  the  Bernoulli  distribution  is  presented  in 
Appendix  A.  1 .  The  dL\n  W is  related  to  eqs  3  and  4  and  the  probit 

(fax 

d‘  =  le=f^  (16) 

For  a  non-canonical  link  such  as  probit,  the  ka  and  kn  are  replaced  with  d[  and  di  because 
relations  /c2i  =  0d;  and  k3i  =  02d[  do  not  apply,  where  0  is  a  dispersion  parameter.  The  d[  for 
the  probit  is 


d2n 

d(P 


-(xiPmxw 


(17) 


Further  discussion  of  the  non-canonical  link  is  provided  in  Section  2.4.  The  JVs  for  the  logistic 
and  probit  binomials  are,  respectively,  the  following  diagonal  elements: 


and 


W  =  diag{7Tj(l  -7Tf)} 

(18) 

W  -  diagf  1  } 

(19) 

The  denominator  in  eq  19  is  the  inverse  of  the  variance  of  y  response  values,  multiplied  by  a 
square  of  the  normal  density  function.  The  numerator  comes  from  the  non-canonical  link  shown 
in  eq  4. 


The  BC  method  subtracts  first-order  bias  from  the  parameter  estimates,  which 
were  infinite  for  the  CS  and  quasi-CS  data.  Thus,  the  BC  method  fails  to  eliminate  the  initial 
problem  of  infinite  estimates. 

2.3  BR  Method  Derivation 

Firth  suggested  that  the  reduction  of  the  first-order  bias  for  the  EFCL  could  be 
performed  by  modifying  the  score  function,  [/*(/?)  =  [/(/?)  —  /(/?)£>(/?),  where  /(/?)  is  the 
information  matrix,  and  b(ft)  is  the  first-order  bias  in  the  maximum  likelihood  estimator,  /?.  '  In 
his  paper,  he  explained  that  this  method  shifts  the  score  function  downward  by  applying  the 
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information  matrix,  which  is  the  gradient  of  the  score  function.  Unlike  the  BC  method,  this 
method  does  not  depend  on  the  maximum  likelihood  estimate.  The  modified  score  function  for 
the  EFCL  can  be  expressed  as  U*  —  U  —  XTWe,  if  we  apply  the  first-order  bias  term  in  eq  12  to 
the  information  matrix,  which  will  be  explained  in  detail  in  Section  2.4  and  Appendix  A. 2. 

A  general  fonn  of  MMLE  for  the  canonical  link  is3 


(t  =  1 


(20) 


where  hi  is  a  diagonal  element 


h  =  w^x{xrwxy1xTw^ 


(21) 


and  W is  given  in  eq  15. 

Firth  also  showed  a  relation  between  the  MMLE  and  the  PL  functions  for  the 

EFCL  for  the  first-order  bias  reduction  expressed  as  U (/?J*  =  U (/?;)  +  Mrace  [/(/?)  _1 

where  I(fi)-1  is  an  inverse  of  the  information  matrix  evaluated  at  [>.  The  solution  of  the  above 

1  1 

MMLE  is  the  stationary  points  for  L*(/?)*  =  L(/?)|/(/?)|2,  where  | / (/?)  | T  is  the  Jeffreys  invariant 
prior.11  Therefore,  the  same  BR  result  can  be  achieved  for  the  EFCL  by  applying  the  PL 
method.3 


The  BR  estimate  is  also  consistent  and  asymptotically  normal.  One  of  the 
properties  for  the  BR  method  for  the  EFCL  is  that  it  always  has  finite  estimates,  even  though  the 
traditional  MLE  has  infinite  values.3 

2.4  BR  Method  Derivation  for  the  Non-Canonical  Model 

A  general  link  form  of  a  modified  score  function  for  the  non-canonical  model 
may  be  expressed  as12 


l 


where  t  indexes  parameters  1  and  i  is  the  number  of  sample  sizes.  The  dL,  d[,  and  hi  are 
given  in  eqs  16,  17,  and  21.  The  difference  between  the  canonical  and  the  non-canonical  models 
is  that  the  latter  involves  the  covariance  between  the  first-  and  second-order  derivatives  of  a 
likelihood  function.6 
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3. 


ESTIMATION  METHOD 


3.1  Estimation  Method  with  Newton  Raphson  Iteration 

A  first-order  Taylor-series  expansion  for  the  modified  score  function  about  the 
true  unknown  parameters,  fh,  may  be  expressed  as13 

0  =  u*(p)  *  tr (/? o)  -  KP o)(P  -  Po)  (23) 

where  /(/?0)  is  an  infonnation  matrix.  From  eq  23, 

P  ~  Po  +  {KPo^U^Po)  (24) 


and  final  parameter  estimates  can  be  obtained  by  updating  initial  estimates  until  the  estimates 
converge  to  certain  criteria.  In  this  approach,  the  values  for  flo  are  replaced  with  initial  starting 
values: 


Ps+1  =Ps  +  {r\ps)un 


(25) 


If  the  relevant  eqs  7,  16,  17,  and  21  are  used  in  eq  22  for  the  probit,  the  MMLE  model  can  be 
written  as 


tj*  _  y  _ f(xiP') _ r  _ —  h  F(.xiP')(.xiP')') 

1  ~  U  F(X;/?)(l-F(*i/?))  Ui  2  1  f(xiP) 


F(Xip)}xiit  t=\,...,p 


(26) 


Matrices  for  an  information  matrix  and  a  modified  score  function  can  be  formed 
from  eqs  10  and  26,  which  would  then  be  used  for  the  iterative  reweighted  estimation  process  in 
eq  24.  A  variance-covariance  matrix  can  be  obtained  from  an  inversion  of  the  infonnation  matrix 
expressed  in  eq  10. 

3.2  Initial  Estimates  for  the  Newton  Raphson  Procedure 

Finding  suitable  initial  estimates  is  a  key  factor  in  getting  converged  estimates  in 
eq  24;  otherwise,  estimates  may  go  out  of  range  for  a  nonnal  density.  A  cumulative  normal 
distribution  may  not  generate  proper  values  in  eq  26. 

There  are  several  ways  to  generate  suitable  initial  estimates,  such  as  by 
augmenting  data  for  the  two-response  values  (0  and  1)  to  make  pseudo-overlapped  data  or  by 
dividing  large  estimated  values  from  the  traditional  MLE  by  a  constant  that  does  not  generate 
extreme  values  in  eq  26.  Estimates  from  an  ordinary  regression  can  also  be  used  if  the  GLM 
function  is  not  available.  Alternatively,  the  simplest  way  is  to  set  all  the  initial  parameters  at  0, 
thus  generating  0.5  and  0.40  for  the  cumulative  nonnal  distributions  in  eq  5  and  the  density 
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function  in  eq  8,  respectively.  The  last  method  may  be  preferable  over  methods  for  other 
platforms  because  it  does  not  require  any  calculations. 


Comparisons  of  parameter  estimations  and  variances  between  the  BR  and 
traditional  MLE  methods  for  the  three  different  data  sets  that  were  shown  in  Figures  1-3  are 
provided  in  Tables  1-3.  Comparisons  of  LD50  and  Cl  values  are  provided  in  Table  4-6. 


Table  1.  Comparisons  of  Parameter  Estimates  and  Standard  Errors  between  BR  and 
_ Traditional  MLE  Methods  for  Figure  1 _ 


Parameter 

Estimate 

Standard  Error 

BR 

Traditional 

BR 

Traditional 

Intercept 

-6.95 

-102.07 

3.74 

95865 

Slope  for  Dose 

5.20 

73.52 

2.83 

69511 

Table  2.  Comparisons  of  LD50  and  Cl  Values  between  BR  and  Traditional  MLE  Methods 


from  1 

fable  1 

Method 

LD50 

Standard  Error 

Cl 

BR 

21.68 

0.10 

(13.68,34.37) 

Traditional  MLE 

24.45 

83.14 

(-00*,  oo) 

*oo  denotes  infinity. 


Table  3.  Comparisons  of  Parameter  Estimates  and  Standard  Errors  between  BR  and 
_ Traditional  MLE  Methods  for  Figure  2 _ 


Parameter 

Estimate 

Standard  Error 

BR 

Traditional 

BR 

Traditional 

Intercept 

-5.64 

-60.03 

3.62 

12250 

Slope  for  Dose 

11.43 

120.34 

6.94 

24381 

Table  4.  Comparisons  of  LD50  and  Cl  Values  between  BR  and  Traditional  MLE  Methods 


from  1 

fable  3 

Method 

LD50 

Standard  Error 

Cl 

BR 

3.12 

0.05 

(2.54,  3.82) 

Traditional  MLE 

3.15 

0.73 

(0.12,  83.22) 

Table  5.  Comparisons  of  Parameter  Estimates  and  Standard  Errors  between  BR  and 
_ Traditional  MLE  Methods  for  Figure  3 _ 


Parameter 

Estimate 

Standard  Error 

BR 

Traditional 

BR 

Traditional 

Intercept 

-8.01 

-9.92 

3.24 

3.87 

Slope  for  Dose 

5.47 

6.73 

2.14 

2.55 
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Table  6.  Comparisons  of  LD50  and  Cl  Values  between  BR  and  Traditional  MLE  Methods 

from  Table  5 


Method 

LD50 

Standard  Error 

Cl 

BR 

29.05 

0.05 

(23.87,  35.34) 

Traditional  MLE 

29.74 

0.04 

(25.26,35.01) 

4.  SUMMARY 

The  BR  method  is  a  variant  of  the  BC  method,  and,  unlike  the  latter,  it  does  not 
require  finite  estimates  in  removing  the  first-order  bias  term  in  the  asymptotic  expansion  of 
MLE.  For  the  EFCL,  the  BR  method  results  in  the  same  parameter  estimates  as  the  PL  method 
using  the  Jeffreys  invariant  prior.  It  also  generates  finite  estimates,  although  the  traditional  MLE 
produces  infinite  estimates. 

For  CS  or  quasi-CS  data,  the  BR  method  does  not  require  prior  information  like 
Bayesian  techniques,  which  may  be  difficult  in  many  different  situations.  Like  the  traditional 
MLE  method,  the  BR  estimator  has  consistency  and  asymptotic  normality.  A  variance- 
covariance  matrix  can  be  obtained  from  an  inversion  of  the  information  matrix  after  the 
reweighted  iterative  procedure  is  finished. 

Short  descriptions  of  formula  derivations  are  provided  because  a  detailed 
discussion  for  complicated  mathematical  theory  developments  is  not  the  focus  of  this  paper. 

R  codes  (Appendix  B)  are  provided  for  executing  the  BR  method,  which  uses  0 
intercept  and  0  slope  for  initial  estimates,  and  the  convergence  criteria  for  the  estimates  are  set  to 
1  x  1 0  5  in  the  codes  for  calculating  the  differences  between  the  two  successive  estimates.  The 
written  BR  method  is  the  same  as  that  in  the  binomial-response  GLM  program  in  R,  and  the 
estimated  values  are  matched  between  the  two. 

All  the  necessary  formulas  and  procedures  are  provided  to  execute  the  BR  method 
to  estimate  LD50  or  ED50  for  binomial  distribution  with  probit  link  for  CS  and  quasi-CS  data. 
These  formulas  are  useful  in  getting  estimates  for  different  computing  platforms,  which  do  not 
have  a  built-in  module  for  MMLE  of  GLM.  The  only  required  computing  capabilities  are 
inversion  and  multiplication  of  a  matrix  with  modules  capable  of  generating  values  from  a 
nonnal  density  function  and  a  cumulative  nonnal  distribution  function  (eqs  5  and  8, 
respectively). 


The  BR  method  is  recommended  as  an  alternative  to  applying  related  probit 
slopes,  which  have  been  used  historically  at  ECBC  for  CS  and  quasi-CS  data.  When  probit 
slopes  are  unavailable  and  when  sample  sizes  are  small,  the  BR  method  may  enable  previously 
unattainable  analyses. 
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ACRONYMS  AND  ABBREVIATIONS 


BC 

bias  correction 

BR 

bias  reduction 

Cl 

confidence  interval 

CS 

completely  separated 

DR 

dose  response 

ECBC 

U.S.  Army  Edgewood  Chemical  Biological  Center 

ED50 

50th  percentile  of  the  effective  dose 

EFCL 

exponential  family  with  canonical  link 

GLM 

generalized  linear  model 

HPAC 

Hazard  Prediction  and  Assessment  Capability 

LD50 

50th  percentile  of  the  lethal  dose 

MLE 

maximum  likelihood  estimation 

MMLE 

modified  maximum  likelihood  estimation 

PL 

penalized  log-likelihood 

quasi-CS 

quasi-completely  separated 

VLSTrack 

Vapor,  Liquid,  and  Solid  Tracking 
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APPENDIX  A 


DERIVATION  OF  CUMULANT  GENERATING  AND 
MODIFIED  MAXIMUM  LIKELIHOOD  ESTIMATION  FUNCTIONS 


A.l 


Derivation  of  the  Cumulant  Generating  Function  for  Bernoulli  Distribution 


g'(t) 


(Al) 


In  eq  Al,  ki  =  g'(0)  =p,  k2  =  g'(0)  =p(l—p).  It  can  be  calculated  from  the 
recursive  formula  in  eq  A2 


d.kr 


K+i  =p(i-p)^ 


(A2) 


A.2 


Derivation  of  Modified  Maximum  Likelihood  Estimation  (MMLE)  Function 


Because  a  general  form  of  a  score  function  for  exponential  form  is1 
U  —  XtWD~1{y  —  n ),  and  the  expected  infonnation  matrix  is  expressed  as  X'WX,  the  reduction 

of  the  first  term  is  /(/?)  x  bl —  =  (XTWX)  x  (XTWX)~1XTW  G,  where  W  is  derived  from 


W  = 


0 di ) 


d- 


,  and  e  for  a  general  form  is^  =  —  Qa  — .  The  d;,  and  d \  are  shown  by 

k2i  2  dj 

-1  vT.  rl  —  dn  — 


the  following  functions:  Q  —  X(X‘  WX )  1X‘ ;  —  —  —  /(xj;  and 

du 

d27i 

d[  —  —  —  (XiP)f(XiP),  respectively.  From  eqs  Al  and  A2,  the  MMLE  is  expressed  as 

l/t*  =  XTWD~1(y  —  n)  —  XTWe.  This  can  be  rewritten  as  l/t*  =  XTWD^1{y  —  De  —  n).  If  W, 
D,  and  e  are  used  as  substitutes  in  the  score  equation,  its  components  can  be  restated  as 

Ut  =  h  (Ji  +  \hA  -  TTi)xit. 

K-2  r  £  wi 


1  Kosmidis,  I.;  Firth,  D.  Bias  Reduction  in  Exponential  Family  Nonlinear  Models.  Biometrika  2009,  96 
(4),  793-804. 


?  McCullagh,  P.;  Nelder,  J.A.  Generalized  Linear  Model,  2nd  ed.;  Chapman  and  Hall:  London,  1989. 
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APPENDIX  B 


R  CODES  USED  FOR  ESTIMATING  AN  INTERCEPT  AND  A  SLOPE 
WITH  THE  BIAS-REDUCTION  METHOD 


function(data) 

{ 

#  This  is  finding  an  intercept  and  a  slope  for  probit  analysis  for  separated  data  for  one  covariate 
’dose'. 

#  The  solution  is  based  on  Ioannis  Kosmidis’s  Ph.D.  dissertation.* 

#  For  initial  estimates,  0  intercept  and  0  slope  are  used. 

#  First,  sort  the  dose  in  ascending  order. 
data<-data[order(data$dose),] 
dose<-data$dose 

log.dose<-log  1 0(dose) 

#  Get  a  length  of  data, 
data.  dim<-dim(data)  [  1  ] 
old.inter<- 10000 
old.sp<- 10000 

#  initial  estimates  of  an  intercept  and  a  slope 
inter<-0 

sp<-0 


while  (abs(old.inter-inter)>=0. 000005  &&  abs(old.sp-sp)>=0. 000005) 

{ 

resp<-data$resp 

#  To  add  an  intercept  term 
x.inter<-rep(l,data.dim) 
x.sp<-log.dose 

#  This  is  from  the  first  derivative 

w<-(dnorm(  inter+sp  *  x .  sp)/ (pnorm(  inter+sp  *  x .  sp)  *  ( 1  -pnorm(inter+sp  *  x .  sp)))) 

#  This  is  from  second  derivative. 

wl<-(dnorm(inter+sp*x.sp))A2/(pnorm(inter+sp*x.sp)*(l-pnonn(inter+sp*x.sp))) 

library(Matrix) 

#  This  is  for  H  matrix 

w  1  .diag<-Diagonal(data.dim,w  1 ) 
x.total<-cbind(x. inter, x.sp) 
t.x.total<-t(x.total) 

middle.  invert<-solve(t.x.  to  tal%*%wl.diag%*%x.  total) 

h<-sqrt(w  1  .diag)%*%x.total%*%middle.invert%*%t.x.total%*%sqrt(w  1  .diag) 

#  Now  pick  up  the  diagonal  elements  of  H  matrix. 
h.diag<-diag(h) 


*  Kosmidis,  I.  Bias  Reduction  in  Exponential  Family  Nonlinear  Models.  Ph.D.  Dissertation,  University  of  Warwick, 
Coventry,  U.K.,  2007. 
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#  This  is  for  an  intercept  in  the  first  derivative. 

#  ub  1  <- w * (resp-pnorm( inter+sp *  x .  sp)+h .  diag  *  (0 . 5 -pnorm( inter+sp *x.sp)))*x. inter 

#  This  is  from  R  score  adjusted  function  ubl<-w*(resp-0. 5*h. diag*(pnonn(inter+sp*x. sp)*(l- 
pnonn(inter+sp  *  x .  sp))  *  (inter+sp  *  x .  sp)/ dnorm(  inter+sp  *  x .  sp))-pnorm( inter+sp  *  x .  sp))  *  x .  inter 
ub  1  .sum<-sum(ub  1 ) 

ub2<-w  *  (resp-0 . 5  *  h.  diag  *  (pnorm(  inter+sp  *  x .  sp)  *  ( 1  - 

pnonn(inter+sp  *  x .  sp))  *  (inter+sp  *  x .  sp)/ dnonn(  inter+  sp  *  x .  sp))-pnorm( inter+sp  *  x .  sp))  *  x .  sp 
ub2.sum<-sum(ub2) 

ub.mat<-matrix(c(ub  1  .sum,ub2.sum),nrow=2) 

#  Generate  a  hessian  matrix, 
hel  l<-sum(wl*x.  inter) 

he  1 2<-sum(w  1  *x.inter*x.sp) 
he2  l<-sum(wl  *x.inter*x.sp) 
he22<-sum(wl  *x.spA2) 

he.mat<-matrix(c(he  1 1  ,he  1 2,he2 1  ,he22),nrow=2,byrow=T) 
new.mat.solve<-solve(he.mat)%*%ub.mat 
o  Id .  mat<-matrix(c(inter,  sp)  ,nro  w=2) 

old.inter<-old.mat[l,l] 

old.sp<-old.mat[2,l] 

#  Upgrade  to  new  estimates. 
new.mat<-old.mat+new.mat.  solve 
inter<-new.mat[  1,1] 
sp<-new.mat[2,l] 

} 

#  Final  estimates 

est«-matrix(c(inter,sp),byrow=T) 

#new.mat«-new.mat 

va.final<-solve(he.mat) 

#  To  retrieve  the  final  variances  matrix, 
va .  final«-va.  final 

} 
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